Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support per-core state using #[thread_local] #794

Closed
wants to merge 4 commits into from

Conversation

jsgf
Copy link
Contributor

@jsgf jsgf commented Apr 15, 2024

  • Controlled by thread_local feature
  • Does not require nightly features, but using it does
  • Intercepts __pre_init to copy .tdata into the per-core state
  • Implements __aeabi_read_tp to returns per-core state for code compiler generates when accessing thread_local variables
  • Needs linker script support to set up the layout and symbols

This is based on picolibc's support for TLS; its the only example I found of it for rp2040.

Prototype for #793

@jsgf jsgf force-pushed the multicore-thread-local branch 5 times, most recently from bbf7b2b to a8768cb Compare April 15, 2024 05:28
- Controlled by `thread_local` feature
- Does not require nightly features, but using it does
- Intercepts `__pre_init` to copy .tdata into the per-core state
- Implements `__aeabi_read_tp` to returns per-core state for code compiler
  generates when accessing `thread_local` variables
- Needs linker script support to set up the layout and symbols

This is based on
[picolibc](https://github.com/picolibc/picolibc/blob/58d6157cc2135df5043d62c3e89feedc20ffcd57/newlib/libc/picolib/machine/arm/read_tp.S#L71)'s
support for TLS; its the only example I found of it for rp2040.
@jannic
Copy link
Member

jannic commented Apr 15, 2024

A few random thoughts:

  • How does this interact with re-spawning the second core? core1.spawn() is not marked unsafe and could be called multiple times. While that's probably not a good idea in most cases, it shouldn't be unsound.
  • Please provide a usage example and some documentation. Even if this is still a prototype, knowing how it is meant to be used makes it easier to think about.
  • Would it be useful to provide a hook to chain another __pre_init function somehow? Not sure how it could look like, and there are not too many use cases for __pre_init functions, but if you want to combine thread_local with something else that needs it, both features just defining the __pre_init symbol would clash.
  • There are discussions about deprecating static mut in rust. I think the current consensus is that only references to static mut are going to be banned. But I'm not following the discussion closely.

@jsgf
Copy link
Contributor Author

jsgf commented Apr 15, 2024

@jannic Thanks for your comments.

How does this interact with re-spawning the second core? core1.spawn() is not marked unsafe and could be called multiple times. While that's probably not a good idea in most cases, it shouldn't be unsound.

There's no problem with doing this in principle. The per-core data has the same semantics as regular static data/bss - it will be in the same state when you re-spawn on the core as it was before. This is a little different from traditional per-thread data, which will be re-initialized each time you start a thread. But I think since the core is a persistent entity, this current behaviour makes sense.

That said, it wouldn't be too hard to reinitialize the per-core data on spawn.

I don't think there's any additional soundness issues here. If the per-thread/core data is static mut then it will be subject to the normal safety requirements that mutable statics need.

Please provide a usage example and some documentation

Will do.

Would it be useful to provide a hook to chain another __pre_init function somehow? Not sure how it could look like, and there are not too many use cases for __pre_init functions, but if you want to combine thread_local with something else that needs it, both features just defining the __pre_init symbol would clash.

Yeah I was thinking about this. I really just need a hook to initialize the per-core data before the core 0 main comes up. I think it would make sense to add a dedicated hook in the cortex-m-rt startup code for this, since in principle this could apply to any multicore implementation.

Otherwise I don't have any great ideas for how to add a chainable __pre_init, unless it's something like ctors.

There are discussions about deprecating static mut in rust. I think the current consensus is that only references to static mut are going to be banned. But I'm not following the discussion closely.

What's the context here? Are you concerned about this implementation's use of static mut or just the relationship between static mut and this feature in general? I haven't seen anything about deprecating it in general, but I'd be interested in pointers if you have them. I don't see a way of doing without it, esp for embedded uses, unless there's some alternative mechanism which gives access to non-stack, non-heap preallocated memory.

@jannic
Copy link
Member

jannic commented Apr 16, 2024

What's the context here?

For example https://internals.rust-lang.org/t/pre-rfc-deprecate-then-remove-static-mut/20072/106. Or rust-lang/rust#117556. I didn't follow it in detail, so your usage could be unrelated to the planned / discussed changes.

@jsgf
Copy link
Contributor Author

jsgf commented Apr 18, 2024

@jannic I see. The proposal is literally about static mut, not the general concept. We can use SyncUnsafeCell instead (or whatever mechanism is introduced).

@jannic
Copy link
Member

jannic commented Apr 18, 2024

Exactly. And I don't think they'll break existing code, at least not without a long grace period. So using static mut now is not out of question. It's just that I'd prefer to use a different mechanism from the start if we introduce a new feature now. Of course, only if such a mechanism already exists.

rp2040-hal/src/multicore.rs Outdated Show resolved Hide resolved
rp2040-hal/src/multicore.rs Outdated Show resolved Hide resolved
rp2040-hal/src/multicore.rs Outdated Show resolved Hide resolved
- for __eabi_read_tp, simplify to just directly load TLS_CORE_[01]
  as needed without the need for an indirection. This only uses r0
  and doesn't touch the stack.
- Write the tdata copy in Rust, being careful to only use raw pointers.
  The generated asm is functionally identical to the hand-written asm.
@jsgf
Copy link
Contributor Author

jsgf commented Apr 21, 2024

Question: How can I config ci to handle a nightly only feature like this?

Aside from that I'm still working on testing so it's not ready for merging.

@ithinuel
Copy link
Member

We only merge stable compatible code, so it’s not something we really have to deal with.

@ithinuel ithinuel self-requested a review April 21, 2024 19:22
@jsgf
Copy link
Contributor Author

jsgf commented Apr 21, 2024

Ok so this is in the back burner until thread local is stabilized? I can probably just publish it as a separate crate, since multicore doesn't actually need to change to support this.

@jsgf
Copy link
Contributor Author

jsgf commented Apr 22, 2024

OK I've published this functionality as a separate crate:

So we can close this for now, and restart the conversation when #[thread_local] is stabilized.

@jsgf jsgf closed this Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants